Cascaded Grammatical Relation Assignment
Abstract
In this paper we discuss cascaded Memory-Based grammatical relations assignment. In the first stages of the cascade, we find chunks of several types (NP, VP, ADJP, ADVP, PP) and label them with their adverbial function (e.g. local, temporal). In the last stage, we assign grammatical relations to pairs of chunks. We studied the effect of adding several levels to this cascaded classifier and we found that even the less well-performing chunkers enhanced the performance of the relation finder.

1 Introduction

When dealing with large amounts of text, finding structure in sentences is often a useful preprocessing step. Traditionally, full parsing is used to find structure in sentences. However, full parsing is a complex task and often provides us with more information than we need. For many tasks, detecting only shallow structures in a sentence in a fast and reliable way is to be preferred over full parsing. For example, in information retrieval it can be enough to find only simple NPs and VPs in a sentence, whereas for information extraction we might also want to find relations between constituents, such as the subject and object of a verb.

In this paper we discuss some Memory-Based (MB) shallow parsing techniques to find labeled chunks and grammatical relations in a sentence. Several MB modules have been developed in previous work, such as a POS tagger (Daelemans et al., 1996), a chunker (Veenstra, 1998; Tjong Kim Sang and Veenstra, 1999) and a grammatical relation (GR) assigner (Buchholz, 1998). The questions we will answer in this paper are: Can we reuse these modules in a cascade of classifiers? What is the effect of cascading? Will errors at a lower level percolate to higher modules?

Recently, many people have looked at cascaded and/or shallow parsing and GR assignment. Abney (1991) was one of the first to propose splitting parsing up into several cascades. He suggests first finding the chunks and then the dependencies between these chunks.
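The cascade idea described above (a lower stage labels chunks, a later stage uses those labels to assign relations) can be illustrated with a minimal sketch. It is not the authors' implementation: the class `MemoryBasedClassifier`, the function `assign_relation`, the feature layout and all training examples are hypothetical, and the nearest-neighbour lookup with a simple overlap distance is only an assumed stand-in for a memory-based learner.

```python
from collections import Counter


class MemoryBasedClassifier:
    """Toy k-nearest-neighbour classifier over symbolic feature tuples (hypothetical)."""

    def __init__(self, k=1):
        self.k = k
        self.memory = []  # stored (features, label) pairs

    def fit(self, examples):
        # "Training" just stores the examples in memory.
        self.memory = list(examples)
        return self

    def predict(self, features):
        # Overlap distance: number of feature positions whose values differ.
        def distance(stored):
            return sum(a != b for a, b in zip(stored, features))

        nearest = sorted(self.memory, key=lambda ex: distance(ex[0]))[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


# Stage 1 (assumed features): chunk type, first word, head word -> adverbial function.
function_labeler = MemoryBasedClassifier().fit([
    (("PP", "in", "city"), "LOC"),
    (("PP", "in", "morning"), "TMP"),
    (("NP", "-", "dog"), "NONE"),
])

# Stage 2 (assumed features): chunk type, side of the verb, predicted function -> relation.
relation_assigner = MemoryBasedClassifier().fit([
    (("NP", "before", "NONE"), "SUBJ"),
    (("NP", "after", "NONE"), "OBJ"),
    (("PP", "after", "LOC"), "LOC-ADJUNCT"),
    (("PP", "after", "TMP"), "TMP-ADJUNCT"),
])


def assign_relation(chunk_type, first_word, head, side):
    """Run the two-stage cascade: stage 1 output becomes a stage 2 feature."""
    function = function_labeler.predict((chunk_type, first_word, head))
    return relation_assigner.predict((chunk_type, side, function))


pp_relation = assign_relation("PP", "in", "city", "after")
np_relation = assign_relation("NP", "-", "cat", "before")
```

Because the second stage sees only the predicted function label, a mistake in the first stage changes the features the relation assigner works with. This is the error percolation that the questions above are concerned with.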
Grefenstette (1996) describes a cascade of finite-state transducers, which first finds noun and verb groups, then their heads, and finally syntactic functions. Brants and Skut (1998) describe a partially automated annotation tool which constructs a complete parse of a sentence by recursively adding levels to the tree. Collins (1997) and Ratnaparkhi (1997) use cascaded processing for full parsing with good results. Argamon et al. (1998) applied Memory-Based Sequence Learning (MBSL) to NP chunking and subject/object identification. However, their subject and object finders are independent of their chunker (i.e. not cascaded).

Drawing on this previous work, we will explicitly study the effect of adding steps to the grammatical relations assignment cascade. Through experiments with cascading several classifiers, we will show that even imperfect classifiers can improve the overall performance of the cascaded classifier. We illustrate this claim on the task of finding grammatical relations (e.g. subject, object, locative) to verbs in text. The GR assigner uses several sources of information step by step, such as several types of XP chunks (NP, VP, PP, ADJP and ADVP) and the adverbial functions assigned to these chunks (e.g. temporal, local). Since not all of these entities are predicted reliably, the question is whether each source leads to an improvement of the overall GR assignment.

In the rest of this paper we will first briefly describe Memory-Based Learning in Section 2. In
Journal: CoRR
Volume: cs.CL/9906004
Publication date: 1999